EUGÈNE: An Eukaryotic Gene Finder That Combines Several Sources of Evidence

Authors

  • Thomas Schiex
  • Annick Moisan
  • Pierre Rouzé
Abstract

In this paper, we describe the basis of EuGène, a gene finder for eukaryotic organisms applied to Arabidopsis thaliana. What distinguishes EuGène from existing gene-finding software is that it has been designed to combine the output of several information sources, including the output of other software and user-supplied information. To achieve this, a weighted directed acyclic graph (DAG) is built in such a way that a shortest feasible path in this graph represents the most likely gene structure of the underlying DNA sequence. The usual simple linear-time Bellman shortest-path algorithm for DAGs has been replaced by a shortest-path algorithm with constraints. The constraints express minimum lengths of introns or intergenic regions. The specific form of the constraints leads to an algorithm that is still linear in both time and space. EuGène's effectiveness has been assessed on Araset, a recent dataset of Arabidopsis thaliana sequences used to evaluate several existing gene-finding programs. It appears that, despite its simplicity, EuGène gives results that compare very favorably with existing software. We try to analyze the reasons for these ...
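
The idea of a shortest path under minimum-length constraints that stays linear in the sequence length can be illustrated with a small sketch. This is not EuGène's actual graph or scoring scheme: it assumes a toy model in which every position receives one label, each label has a per-position cost, changing label costs a fixed amount, and every segment of a label must reach a minimum length. The function name `segment`, the labels `"I"` (intergenic) and `"G"` (gene), and all numbers are hypothetical. A running minimum over admissible segment starts keeps the work per position constant for a fixed set of labels.

```python
import math

def segment(costs, min_len, switch_cost):
    """Toy constrained shortest path over a label-per-position model.

    costs[s][i]  : cost of giving position i the label s (assumed input)
    min_len[s]   : minimum length (>= 1) of any segment labelled s
    switch_cost  : fixed cost paid whenever the label changes
    Returns (total_cost, labels), or (inf, None) if no labelling is feasible.
    """
    states = list(costs)
    n = len(costs[states[0]])

    # prefix[s][i] = sum of costs[s][0:i], so a segment [j, i) costs
    # prefix[s][i] - prefix[s][j].
    prefix = {}
    for s in states:
        acc = [0.0]
        for c in costs[s]:
            acc.append(acc[-1] + c)
        prefix[s] = acc

    # enter[s][j]: best cost of labelling [0, j) so that an s-segment starts at j.
    enter = {s: [math.inf] * (n + 1) for s in states}
    prev_state = {s: [None] * (n + 1) for s in states}
    # close[s][i]: best cost of labelling [0, i) whose last segment is s
    # and already satisfies min_len[s].
    close = {s: [math.inf] * (n + 1) for s in states}
    seg_start = {s: [None] * (n + 1) for s in states}
    # Running minimum of enter[s][j] - prefix[s][j] over admissible starts j.
    run_val = {s: math.inf for s in states}
    run_arg = {s: None for s in states}

    for i in range(n + 1):
        for s in states:
            j = i - min_len[s]  # latest start that respects the minimum length
            if j >= 0 and enter[s][j] - prefix[s][j] < run_val[s]:
                run_val[s], run_arg[s] = enter[s][j] - prefix[s][j], j
            if run_arg[s] is not None:
                close[s][i] = run_val[s] + prefix[s][i]
                seg_start[s][i] = run_arg[s]
        for s in states:
            if i == 0:
                enter[s][0] = 0.0  # any label may open the sequence
                continue
            for t in states:
                if t != s and close[t][i] + switch_cost < enter[s][i]:
                    enter[s][i] = close[t][i] + switch_cost
                    prev_state[s][i] = t

    last = min(states, key=lambda s: close[s][n])
    if math.isinf(close[last][n]):
        return math.inf, None

    # Walk back through the stored segment starts to recover the labels.
    labels = [None] * n
    s, i = last, n
    while True:
        j = seg_start[s][i]
        labels[j:i] = [s] * (i - j)
        if j == 0:
            break
        s, i = prev_state[s][j], j
    return close[last][n], labels


toy_costs = {
    "I": [0.0, 0.0, 3.0, 3.0, 0.0, 0.0, 0.0],
    "G": [2.0, 2.0, 0.0, 0.0, 1.0, 2.0, 2.0],
}
total, labels = segment(toy_costs, {"I": 2, "G": 3}, switch_cost=0.5)
print(total, labels)
```

In this toy input the unconstrained optimum would label only positions 2 and 3 as `"G"`; the minimum length of 3 forces the `"G"` segment to extend over position 4 as well. The running time is proportional to the sequence length times the square of the number of labels, and the tables use space proportional to the sequence length.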

Similar resources

Preparation of suitable constructs for expression of the human interferon-gamma gene in Leishmania tarentolae

 Background and Objective: The aim of the present study was to express recombinant human interferon-gamma in Leishmania tarentolae. The Leishmania expression system represents the combination of easy handling known from bacterial expression systems with the potential of a eukaryotic protein expression, folding and modification system. The trypanosomatid protozoan host Leishmania tarent...

Designing and construction of a DNA vaccine encoding tb10.4 gene of Mycobacterium tuberculosis

Background: Tuberculosis (TB) remains a major cause of death around the world. Construction of a new vaccine against tuberculosis is an effective way to control it. Several vaccines against this disease have been developed. The aim of the present study was to clone the tb10.4 gene in the pcDNA3.1+ plasmid and evaluate its expression in eukaryotic cells. ...

EuGène-maize: a web site for maize gene prediction

Motivation: A large part of the maize B73 genome sequence is now available and emerging sequencing technologies will offer cheap and easy ways to sequence areas of interest from many other maize genotypes. One of the steps required to turn these sequences into valuable information is gene content prediction. To date, there is no publicly available gene predictor specifically trained for maize se...

Using native and syntenically mapped cDNA alignments to improve de novo gene finding

Motivation: Computational annotation of protein coding genes in genomic DNA is a widely used and essential tool for analyzing newly sequenced genomes. However, current methods suffer from inaccuracy and do poorly with certain types of genes. Including additional sources of evidence of the existence and structure of genes can improve the quality of gene predictions. For many eukaryotic genomes, e...

Construction of an Eukaryotic Expression Vector Encoding Herpes Simplex Virus Type 2 Glycoprotein D and In Vitro Expression of the Desired Protein

To construct a eukaryotic expression vector encoding herpes simplex virus type 2 (HSV-2) glycoprotein D (gD2), an Iranian isolate of HSV-2 was propagated in the HeLa cell line, and its DNA was extracted and used as the template in polymerase chain reactions (PCR) to amplify the gD2 gene. Primers were designed and the restriction enzyme sites for EcoRI and XhoI were considered at their 5′ ends respectiv...


Publication date: 2000